3 Common Distributions

1 Distribution Derived from 0-1 Sequence

Consider a sequence of 0s and 1s (we will call this a Bernoulli process). Denote the value at index $i$ by a RV $X_i$, $i = 1, \dots, n$.

1.1 Bernoulli Distribution

If $X \sim \mathrm{Bernoulli}(p)$, then $P(X=1)=p$, $P(X=0)=1-p$ $(0<p<1)$.
$E[X]=p$, $\operatorname{Var}(X)=E[X^2]-E[X]^2=p(1-p)$.
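As a quick sanity check, these moments can be verified by direct enumeration over the two-point pmf (a minimal sketch in Python; the choice $p=3/10$ is arbitrary):

```python
from fractions import Fraction

# Bernoulli(p): X = 1 with probability p, X = 0 with probability 1-p.
# Verify E[X] = p and Var(X) = E[X^2] - E[X]^2 = p(1-p) exactly.
p = Fraction(3, 10)  # arbitrary illustrative value
pmf = {0: 1 - p, 1: p}

mean = sum(x * prob for x, prob in pmf.items())
second_moment = sum(x**2 * prob for x, prob in pmf.items())
var = second_moment - mean**2

assert mean == p
assert var == p * (1 - p)
```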

If $X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(p)$, we call this a Bernoulli process. Let $S_n=X_1+\dots+X_n$. This leads to:

1.2 Binomial Distribution

$S_n \sim \mathrm{Binomial}(n,p)$, with $P(S_n=k)=\binom{n}{k}p^k(1-p)^{n-k}$.
By linearity of expectation, $E[S_n]=E\left[\sum_{i=1}^n X_i\right]=\sum_{i=1}^n E[X_i]=np$.
By independence of $X_1,\dots,X_n$ and (3.1), $\operatorname{Var}(S_n)=\sum_{i=1}^n \operatorname{Var}(X_i)=np(1-p)$.
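Both formulas can be checked against the pmf itself by summing over the support (a sketch; $n=10$, $p=0.3$ are arbitrary):

```python
import math

# Binomial pmf: P(S_n = k) = C(n, k) p^k (1-p)^(n-k).
# Check E[S_n] = n*p and Var(S_n) = n*p*(1-p) by direct summation.
n, p = 10, 0.3  # arbitrary illustrative values
pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k**2 * pk for k, pk in enumerate(pmf)) - mean**2

assert math.isclose(sum(pmf), 1.0)       # pmf sums to 1
assert math.isclose(mean, n * p)         # = 3.0
assert math.isclose(var, n * p * (1 - p))  # = 2.1
```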

1.3 Geometric Distribution

Denote by $W_i$ the waiting time between the $(i-1)$th and $i$th successes. Then $W_i \sim \mathrm{Geometric}(p)$, with $P(W=k)=(1-p)^{k-1}p$, $k \in \mathbb{N}$.
Then $E[W]=\sum_{k=1}^{\infty} k(1-p)^{k-1}p=\frac{1}{p}$, and $\operatorname{Var}(W)=E[W^2]-E[W]^2=\sum_{k=1}^{\infty} k^2(1-p)^{k-1}p-\frac{1}{p^2}=\frac{1-p}{p^2}$.

These moments can be computed more easily using the MGF.
The MGF of $W \sim \mathrm{Geometric}(p)$ is $M_W(t)=\frac{pe^t}{1-(1-p)e^t}$ (for $t<-\ln(1-p)$), so
$E[W]=\frac{d}{dt}M_W(t)\Big|_{t=0}=\frac{pe^t}{(1-(1-p)e^t)^2}\Big|_{t=0}=\frac{1}{p}$, and
$E[W^2]=\frac{d^2}{dt^2}M_W(t)\Big|_{t=0}=\frac{pe^t\left(1+(1-p)e^t\right)}{(1-(1-p)e^t)^3}\Big|_{t=0}=\frac{2-p}{p^2}$,
which recovers $\operatorname{Var}(W)=\frac{2-p}{p^2}-\frac{1}{p^2}=\frac{1-p}{p^2}$.
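Both moments can be sanity-checked numerically by truncating the series (a sketch; $p=0.4$ and the truncation point $K=200$ are arbitrary, and the tail beyond $K$ is negligible):

```python
import math

# Geometric(p) on {1, 2, ...}: P(W = k) = (1-p)^(k-1) p.
# Verify E[W] = 1/p and E[W^2] = (2-p)/p^2 by truncated summation.
p, K = 0.4, 200  # arbitrary illustrative values
ks = range(1, K + 1)
pmf = [(1 - p)**(k - 1) * p for k in ks]

m1 = sum(k * pk for k, pk in zip(ks, pmf))
m2 = sum(k**2 * pk for k, pk in zip(ks, pmf))

assert math.isclose(m1, 1 / p)                 # first moment
assert math.isclose(m2, (2 - p) / p**2)        # second moment
assert math.isclose(m2 - m1**2, (1 - p) / p**2)  # variance
```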

1.4 Negative Binomial Distribution

Let $r \in \mathbb{N}$. Use $T_r$ to denote the total waiting time to the $r$th success. Then $T_r=W_1+\dots+W_r$ with $W_1,\dots,W_r \overset{\text{i.i.d.}}{\sim} \mathrm{Geometric}(p)$, and $P(T_r=n)=\binom{n-1}{r-1}p^{r-1}(1-p)^{n-r}\,p$.
Use $F_r$ to denote the number of failures before the $r$th success; then $F_r+r=T_r$. Write $F_r \sim \mathrm{NegativeBinomial}(r,p)$, with $P(F_r=k)=\binom{r+k-1}{r-1}p^r(1-p)^k=\binom{r+k-1}{k}p^r(1-p)^k$.
Then $E[F_r]=E[T_r]-r=\frac{r}{p}-r=\frac{r(1-p)}{p}$ and $\operatorname{Var}(F_r)=\operatorname{Var}(T_r-r)=\operatorname{Var}(T_r)=r\operatorname{Var}(W_1)=\frac{r(1-p)}{p^2}$.
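These mean and variance formulas can be checked against the pmf of $F_r$ by truncated summation (a sketch; $r=3$, $p=0.5$ are arbitrary):

```python
import math

# Negative binomial (failure count): P(F_r = k) = C(r+k-1, k) p^r (1-p)^k.
# Check E[F_r] = r(1-p)/p and Var(F_r) = r(1-p)/p^2 numerically.
r, p, K = 3, 0.5, 400  # arbitrary illustrative values; tail beyond K is negligible
pmf = [math.comb(r + k - 1, k) * p**r * (1 - p)**k for k in range(K + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k**2 * pk for k, pk in enumerate(pmf)) - mean**2

assert math.isclose(sum(pmf), 1.0)
assert math.isclose(mean, r * (1 - p) / p)      # = 3.0
assert math.isclose(var, r * (1 - p) / p**2)    # = 6.0
```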

2 Hypergeometric Distribution

There are $B$ blue balls and $R$ red balls; $N=B+R$. The sample size is $n<N$.

If we sample $n$ balls u.a.r. with replacement, let $X(\omega)$ be the number of blue balls in $\omega \in \Omega$ (the sample space). Then $X \sim \mathrm{Binomial}(n,p)$ with $p=\frac{B}{N}$.

If we sample $n$ balls u.a.r. without replacement, again let $X(\omega)$ be the number of blue balls. Clearly, when $k>B$ or $n-k>R$, $P(X=k)=0$.
For $\max\{0,\,n-R\} \le k \le B$, consider a particular ordered outcome $\omega$ with $X(\omega)=k$. Every such outcome has the same probability $q=P(\{\omega\})=\dfrac{\frac{B!}{(B-k)!}\cdot\frac{R!}{(R-(n-k))!}}{\frac{N!}{(N-n)!}}$.

There are $\binom{n}{k}$ distinct ordered sequences with $k$ blue balls and $n-k$ red balls, so $P(X=k)=\binom{n}{k}q=\dfrac{\binom{B}{k}\binom{N-B}{n-k}}{\binom{N}{n}}$.
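This identity between the ordered-sample computation and the standard binomial-coefficient form can be verified exactly over the whole support (a sketch; $N=20$, $B=8$, $n=6$ are arbitrary):

```python
import math
from fractions import Fraction

# Check C(n,k) * q == C(B,k) C(N-B, n-k) / C(N,n) for every k in the support.
N, B, n = 20, 8, 6  # arbitrary illustrative values
R = N - B

def falling(a, j):
    # falling factorial a! / (a-j)!
    return math.perm(a, j)

for k in range(max(0, n - R), min(n, B) + 1):
    q = Fraction(falling(B, k) * falling(R, n - k), falling(N, n))
    lhs = math.comb(n, k) * q
    rhs = Fraction(math.comb(B, k) * math.comb(N - B, n - k), math.comb(N, n))
    assert lhs == rhs
```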
This distribution is called the Hypergeometric distribution, $X \sim \mathrm{Hypergeometric}(N,B,n)$.

$P(X=k) \to \binom{n}{k}p^k(1-p)^{n-k}$ as $B, N \to \infty$ with $\frac{B}{N} \to p$.
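This binomial limit can be observed numerically by growing $N$ while keeping $B/N$ fixed (a sketch; $p=0.4$, $n=5$, $k=2$, and the sequence of $N$ values are arbitrary):

```python
import math

# Hypergeometric(N, B, n) pmf approaches the Binomial(n, p) pmf as
# N grows with B/N = p held fixed.
def hyper_pmf(N, B, n, k):
    return math.comb(B, k) * math.comb(N - B, n - k) / math.comb(N, n)

p, n, k = 0.4, 5, 2  # arbitrary illustrative values
binom = math.comb(n, k) * p**k * (1 - p)**(n - k)

errors = [abs(hyper_pmf(N, round(p * N), n, k) - binom)
          for N in (50, 500, 5000, 50000)]

assert all(e1 > e2 for e1, e2 in zip(errors, errors[1:]))  # error shrinks with N
assert errors[-1] < 1e-4
```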

2.1 Expectation

Define the indicator RV $I_i(\omega)=\begin{cases}1, & \text{the } i\text{th draw is blue for } \omega \in \Omega,\\ 0, & \text{otherwise.}\end{cases}$
Then $X=I_1+\dots+I_n$, and $(I_1,\dots,I_n)$ is exchangeable. So $P(I_i=1)=P(I_1=1)=\frac{B}{N}=E[I_1]$, hence $E[X]=\sum_{i=1}^n E[I_i]=nE[I_1]=\frac{nB}{N}$.

This is the same as the expectation of $\mathrm{Binomial}(n,\frac{B}{N})$.
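The formula $E[X]=nB/N$ can be confirmed exactly from the pmf itself, without the indicator trick (a sketch using exact rational arithmetic; $N=20$, $B=8$, $n=6$ are arbitrary):

```python
import math
from fractions import Fraction

# E[X] for X ~ Hypergeometric(N, B, n), computed directly from the pmf.
N, B, n = 20, 8, 6  # arbitrary illustrative values
R = N - B

mean = sum(Fraction(k * math.comb(B, k) * math.comb(R, n - k), math.comb(N, n))
           for k in range(max(0, n - R), min(n, B) + 1))

assert mean == Fraction(n * B, N)  # n*B/N = 12/5
```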

2.2 Variance

By exchangeability,

$\operatorname{Var}(X)=\operatorname{Cov}(I_1+\dots+I_n,\,I_1+\dots+I_n)=\sum_{i=1}^n \operatorname{Var}(I_i)+\sum_{i\ne j}\operatorname{Cov}(I_i,I_j)=n\operatorname{Var}(I_1)+n(n-1)\operatorname{Cov}(I_1,I_2).$

Since $\operatorname{Var}(I_1)=\frac{B}{N}\left(1-\frac{B}{N}\right)$ and $\operatorname{Cov}(I_1,I_2)=E[I_1I_2]-E[I_1]E[I_2]$ with $E[I_1I_2]=P(I_1=I_2=1)=\frac{B}{N}\cdot\frac{B-1}{N-1}$,
we get $\operatorname{Cov}(I_1,I_2)=-\frac{B(N-B)}{N^2(N-1)}<0$, and $\operatorname{Var}(X)=n\,\frac{B}{N}\cdot\frac{N-B}{N}\cdot\frac{N-n}{N-1}$.

The factor $n\,\frac{B}{N}\cdot\frac{N-B}{N}$ is the variance of $\mathrm{Binomial}(n,\frac{B}{N})$; the remaining factor $\frac{N-n}{N-1}\le 1$ is the finite-population correction.
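The variance formula, including the correction factor, can also be verified exactly from the pmf (a sketch; $N=20$, $B=8$, $n=6$ are arbitrary):

```python
import math
from fractions import Fraction

# Verify Var(X) = n (B/N)((N-B)/N)((N-n)/(N-1)) for the hypergeometric pmf.
N, B, n = 20, 8, 6  # arbitrary illustrative values
R = N - B
denom = math.comb(N, n)

support = range(max(0, n - R), min(n, B) + 1)
pmf = {k: Fraction(math.comb(B, k) * math.comb(R, n - k), denom) for k in support}

mean = sum(k * pk for k, pk in pmf.items())
var = sum(k**2 * pk for k, pk in pmf.items()) - mean**2

p = Fraction(B, N)
assert var == n * p * (1 - p) * Fraction(N - n, N - 1)
```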

3 Multinomial Distribution

Consider repeated trials, each with $m$ possible types of outcome. (For Bernoulli trials, $m=2$; for rolling a die, $m=6$.)
Denote by $N$ the number of trials and by $X_i$ the number of times type $i$ is observed.
Write $(X_1,\dots,X_m)=\mathbf{X}$ ($X_1,\dots,X_m$ are not independent given $N=n$); then $(X_1,\dots,X_m)\mid N=n \sim \mathrm{Multinomial}(n,p_1,\dots,p_m)$, with $p_j \in [0,1]$ and $\sum_{j=1}^m p_j = 1$.
A possible outcome $a=(a_1,\dots,a_m)$ must satisfy $\sum_{j=1}^m a_j=n$ and $a_j \in \{0,\dots,n\}$. So $P(\mathbf{X}=a \mid N=n)=\binom{n}{a_1,\dots,a_m}p_1^{a_1}\cdots p_m^{a_m}$, where $\binom{n}{a_1,\dots,a_m}=\frac{n!}{a_1!\cdots a_m!}$.
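As a sanity check, the multinomial pmf sums to 1 over all outcomes with $a_1+\dots+a_m=n$ (a sketch; $m=3$, $n=4$, and the probability vector are arbitrary):

```python
import math
from itertools import product

# Multinomial pmf: P(X = a) = n!/(a1!...am!) * p1^a1 ... pm^am.
n, probs = 4, (0.2, 0.3, 0.5)  # arbitrary illustrative values

def multinomial_pmf(a, probs):
    coef = math.factorial(sum(a))
    for ai in a:
        coef //= math.factorial(ai)
    return coef * math.prod(p**ai for p, ai in zip(probs, a))

# Enumerate all tuples a with a1 + ... + am = n and sum the pmf.
total = sum(multinomial_pmf(a, probs)
            for a in product(range(n + 1), repeat=len(probs))
            if sum(a) == n)

assert math.isclose(total, 1.0)
```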


3.1 Poissonization of the Multinomial
Now suppose the number of trials is a RV $N \sim \mathrm{Poisson}(\lambda)$. We want to determine the distribution of $(X_1,\dots,X_m)$:
$$P(\mathbf{X}=a)=\sum_{n=0}^{\infty}P(\mathbf{X}=a \mid N=n)P(N=n)=\sum_{n=0}^{\infty}\frac{n!}{a_1!\cdots a_m!}p_1^{a_1}\cdots p_m^{a_m}\,\mathbf{1}\!\left\{\textstyle\sum_{j=1}^m a_j=n\right\}\frac{\lambda^n}{n!}e^{-\lambda}=\frac{p_1^{a_1}\cdots p_m^{a_m}}{a_1!\cdots a_m!}\lambda^{a_1+\dots+a_m}e^{-\lambda(p_1+\dots+p_m)}=\left[\frac{(p_1\lambda)^{a_1}}{a_1!}e^{-p_1\lambda}\right]\cdots\left[\frac{(p_m\lambda)^{a_m}}{a_m!}e^{-p_m\lambda}\right].$$
So if $N \sim \mathrm{Poisson}(\lambda)$,

  1. $X_1,\dots,X_m$ are independent, and
  2. $X_j \sim \mathrm{Poisson}(p_j\lambda)$,

even though $\mathbf{X} \mid N=n \sim \mathrm{Multinomial}(n,p_1,\dots,p_m)$.
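The collapse of the sum above to a single term (only $n=a_1+\dots+a_m$ survives the indicator) can be checked numerically; $\lambda=2.5$, the $p_j$, and the outcome $a=(1,0,3)$ below are arbitrary:

```python
import math

# Poissonization check: with N ~ Poisson(lam),
# sum_n P(X=a | N=n) P(N=n) should equal prod_j Poisson(p_j*lam) pmf at a_j.
lam, probs, a = 2.5, (0.2, 0.3, 0.5), (1, 0, 3)  # arbitrary illustrative values
n = sum(a)

def poisson_pmf(k, mu):
    return mu**k * math.exp(-mu) / math.factorial(k)

# Only the n = a1 + ... + am term survives the indicator in the sum.
coef = math.factorial(n)
for ai in a:
    coef //= math.factorial(ai)
lhs = coef * math.prod(p**ai for p, ai in zip(probs, a)) * poisson_pmf(n, lam)

# Product of independent Poisson(p_j * lam) pmfs.
rhs = math.prod(poisson_pmf(ai, p * lam) for ai, p in zip(a, probs))

assert math.isclose(lhs, rhs)
```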